Parsing with Intraclausal Coordination and Clause Detection

Author

  • Domen Marincic
Abstract

Syntactic analysis, i.e., parsing of text, is used in various tasks such as machine translation and question answering. The structure of a sentence is represented as a tree. Parsing long sentences is difficult. Our motivation was to analyze sub-units of the sentence independently, which could improve overall parsing accuracy. We developed a new parsing algorithm that includes intraclausal coordination detection and clause detection. Parsing with clause detection was first tried by Abney (1), whose algorithm delimits non-embedded clauses before the complete parse is made. In (2), there is a short description of a rule-based parser in which clause identification is part of the parsing process. A detailed description of our new algorithm can be found in (3). To our knowledge, the algorithm is the first to use intraclausal coordination detection in conjunction with clause detection before parsing. The most important contribution is a decrease in the number of parsing errors for Slovene of 7.1% and 6.4% compared to the Malt (4) and MSTP (5) baseline parsers, respectively.

Similar resources

Intraclausal Coordination and Clause Detection as a Preprocessing Step to Dependency Parsing

The impact of clause and intraclausal coordination detection on dependency parsing of Slovene is examined. New methods based on machine learning and heuristic rules are proposed for clause and intraclausal coordination detection. They were included in a new dependency parsing algorithm, PACID. For evaluation, the Slovene dependency treebank was used. At parsing, 6.4% and 9.2% relative error reduct...

Parsing With Clause and Intra-clausal Coordination Detection

We present a new dependency parsing algorithm based on the decomposition of large sentences into smaller units such as clauses and intraclausal coordinations. For the identification of these units, new methods combining machine learning techniques and heuristic rules were developed. The algorithm was evaluated on the Slovene dependency treebank text corpus. Compared to the MSTP parser, currentl...

Coordination Boundary Identification with Similarity and Replaceability

We propose a neural network model for coordination boundary detection. Our method relies on two common properties — similarity and replaceability in conjuncts — in order to detect both similar and dissimilar pairs of conjuncts. The model improves the identification of clause-level coordination using bidirectional recurrent neural networks incorporating two properties as features. We show that o...

Relative clause attachment ambiguity resolution in Persian

The present study seeks to find how Persian native speakers resolve relative clause attachment ambiguities in sentences containing a complex NP of the type "NP of NP" followed by a relative clause (RC). Previous off-line studies have found a preference for high attachment. In the present study, an on-line technique was used to help identify the nature of this process. Persian speakers were pre...

Relative Clause Ambiguity Resolution in L1 and L2: Are Processing Strategies Transferred?

This study investigates whether Persian native speakers highly advanced in English as a second language (L2ers) can switch to optimal processing strategies in the languages they know and whether working memory capacity (WMC) plays a role in this respect. To this end, using a self-paced reading task, we examined the processing strategies 62 Persian-speaking proficient L2ers used to read...

Journal title:
  • Informatica (Slovenia)

Volume 34  Issue 

Pages  -

Publication date 2010